feat: OpenAI Ingredient Parsing #3581
Conversation
I've only skimmed it, this is not a definitive review! Is it possible to make the endpoint configurable? I'm not across all the details, but I believe a lot of the self-hosted LLM projects have "OpenAI-compatible" endpoints. It would be great if we could easily support those as well, particularly given Mealie is in the same self-hosting space. (Obviously, I'd be happy for any change here to be a subsequent PR)
It might be possible, but it would require a lot of extra work for a few reasons, which I've stayed away from in this PR.
Can you elaborate on what you mean by "JSON response unique to OpenAI"? I believe what boc is saying is that projects like ollama have an OpenAI-compatible API, allowing it to act as a drop-in replacement for the endpoint in the OpenAI library.
Specifically OpenAI's JSON mode: https://platform.openai.com/docs/guides/text-generation/json-mode
I didn't realize that works even with the OpenAI library; that's super nice. Looks like we can just make the OpenAI base URL customizable and enable this. What I wanted to avoid was writing a custom client to interact with OpenAI, since that's a lot to maintain and really out of scope for Mealie.
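For reference, making the base URL configurable could look roughly like this. This is a minimal sketch: `OPENAI_BASE_URL` and the `OpenAISettings` shape are assumptions for illustration, not names this PR necessarily uses.

```python
import os
from dataclasses import dataclass
from typing import Optional

# Sketch only: OPENAI_BASE_URL and OpenAISettings are hypothetical names.
@dataclass
class OpenAISettings:
    api_key: str
    base_url: str  # any OpenAI-compatible endpoint, e.g. a local ollama server

def load_openai_settings() -> Optional[OpenAISettings]:
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        return None  # no key -> OpenAI features stay disabled
    return OpenAISettings(
        api_key=api_key,
        # ollama typically exposes an OpenAI-compatible API at http://localhost:11434/v1
        base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
    )
```

The official `openai` Python client accepts a `base_url` argument, so the same client code can talk to any compatible server.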
If I understand this correctly, the input is still the metadata from a website in an open recipe format, right? I ask because I am parsing a website's HTML with GPT, as not every website has the recipe in a structured format. I've stumbled across multiple examples where the recipe is only available in the text/HTML and ChatGPT needs to intelligently parse it into JSON. It works very well, actually.
Correct, this PR is not for scraping websites and generating recipes. This is for recipes that have already been imported, but whose ingredients are not yet parsed. However, I do have plans to support alternative import methods using OpenAI, building off of the foundation of this PR. Theoretically we can fall back to parsing a website with OpenAI when recipe metadata isn't available.
There are a few different discussions that I think are great ways to apply this to other areas of Mealie later down the road.
Does this require a paid OpenAI account? The default model is
You may use any LLM that has an OpenAI-compatible API; for instance, see ollama, posted above. You just need to specify your own. I've only tested with gpt-4 (and its variants), so I can only confirm that those work; however, it's fully configurable per instance. I will say that with gpt-4 you blow through the free tier extremely quickly. I've built in some measures to reduce costs, as well as some configurability to trade off speed vs. cost. With gpt-4o it seems to cost 5-10 cents per parsed recipe (with 2 workers and ~10 ingredients).
Is there a reason to prefer a more powerful and more expensive model than 3.5-turbo ($0.50/1M tokens) as the default? |
Short answer: No, not really, but the default hardly matters when it doesn't work out of the box anyway; at a minimum you need to supply an API key, so there's nothing stopping you from also setting the model. Longer answer: I've had a lot more success with GPT-4 when it comes to anything other than conversational interaction. GPT-3.5 is also a lot moodier when it comes to following prompts. GPT-4 is also much better at parsing non-English languages, which is particularly important for a parser that needs to understand grammar.
LGTM. I particularly like the introduction of parserLoading. Let's get this in front of people and see what feedback comes through!
What type of PR is this?
(REQUIRED)
What this PR does / why we need it:
(REQUIRED)
This PR opens the door to implementing OpenAI in Mealie, and implements a new OpenAI ingredient parser. At a high level, this adds an OpenAI service that manages stored prompts and data injection to call the OpenAI API and receive a JSON response (which we then parse into a Pydantic model).
To enable OpenAI features, users need to include their OpenAI API key in the backend config (using the OPENAI_API_KEY env var). There are a few other configuration options to tweak performance vs. cost (since the API isn't free). Since OpenAI configuration is done via environment variables, this doesn't require any DB migrations.
The way this works is we have stored prompts which get sent to OpenAI to instruct it on what to do, i.e. "You are a bot designed to parse ingredients for recipes" (the actual prompt is much longer and goes into far more detail). It then sends a JSON list of inputs as the user message for it to process.
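The message shape described above can be sketched roughly like this. The prompt text and function name here are illustrative stand-ins, not what actually ships in this PR:

```python
import json

SYSTEM_PROMPT = (
    # Stand-in text; the real stored prompt is much longer and more detailed.
    "You are a bot designed to parse ingredients for recipes. "
    "Respond only with JSON matching the provided schema."
)

def build_messages(ingredients: list) -> list:
    """Build a chat-completions message list: stored system prompt,
    then a JSON list of inputs as the user message."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": json.dumps(ingredients)},
    ]
```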
The OpenAI API supports returning its response in JSON format, which is perfect for FastAPI/Pydantic validation. I used Pydantic's BaseModel.model_dump_json() to inject the expected response schema into the prompt, which makes GPT always respond in a parsable format. From there, implementing an interface is simple:
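Conceptually, the schema-injection-and-validation loop looks like the stdlib-only sketch below (the real code uses Pydantic models rather than raw dicts, and JSON mode is enabled via the API's `response_format={"type": "json_object"}` option; the field names here are made up for illustration):

```python
import json

# Hypothetical example shape; the real PR serializes a Pydantic model instead.
EXAMPLE_RESPONSE = {
    "ingredients": [{"input": "", "quantity": 0, "unit": "", "food": "", "note": ""}]
}

def schema_hint() -> str:
    """Serialize an example response so the prompt pins down the output shape."""
    return "Respond with JSON in exactly this shape: " + json.dumps(EXAMPLE_RESPONSE)

def validate_response(raw: str) -> list:
    """Parse the model's JSON reply and enforce the keys we asked for."""
    data = json.loads(raw)
    for item in data["ingredients"]:
        missing = set(EXAMPLE_RESPONSE["ingredients"][0]) - set(item)
        if missing:
            raise ValueError(f"response missing keys: {missing}")
    return data["ingredients"]
```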
Our OpenAI service handles the prompt injection, additional data injection (see below), and API handling; you just need to provide it the data and a description of how to use the data.
For the parser I opted to serialize our unit store and send it along with the rest of the prompt. This gives GPT some training data to say "you should expect to see these units". Originally I also included foods, but it didn't seem to help much at all (and adding the entire food store racks up API costs). This is configurable in the env settings: if you want to reduce costs, you can skip the optional data injection.
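A sketch of that optional data injection (the function name and prompt wording are assumptions; the real service does this internally):

```python
import json

def inject_units(prompt: str, units: list, enabled: bool = True) -> str:
    """Optionally append the instance's known units so the model expects them."""
    if not enabled or not units:
        # Skipping the injection keeps the request smaller and cheaper.
        return prompt
    return prompt + "\nExpect to see these units: " + json.dumps(units)
```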
The OpenAI API isn't very fast when the responses are long. I took a bunch of measures to optimize this, but you can also split the ingredients into chunks and send multiple async requests (one for each chunk). This speeds up the parse time considerably, but costs more. The worker count is configurable in the env settings.
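The chunking idea can be sketched with asyncio as below; the function names and the shape of `parse_chunk` are assumptions, not the PR's actual API:

```python
import asyncio

def chunk(items: list, workers: int) -> list:
    """Split ingredients into roughly equal chunks, one per worker."""
    workers = max(1, min(workers, len(items)))
    size = -(-len(items) // workers)  # ceiling division
    return [items[i : i + size] for i in range(0, len(items), size)]

async def parse_all(items: list, workers: int, parse_chunk) -> list:
    """Fire one request per chunk concurrently and flatten the results in order."""
    results = await asyncio.gather(*(parse_chunk(c) for c in chunk(items, workers)))
    return [row for group in results for row in group]
```

More workers means more concurrent requests (faster wall-clock time) but also more prompt overhead repeated per request, which is the speed-vs-cost trade-off mentioned above.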
This PR also adds some QoL features on the frontend for parsing ingredients. Namely:
I've also hidden the OpenAI ingredient parser if OpenAI isn't enabled (i.e. you haven't provided an API key).
Which issue(s) this PR fixes:
(REQUIRED)
N/A, though it has been discussed on and off
Special notes for your reviewer:
(fill-in or delete this section)
The prompts (this one and future ones) will likely go through a bunch of iterations before we hit that "sweet spot" for getting the best results out of GPT. Ideally, they will be optimized for newer models in the future (we may even decide to have different prompts for different models). This is why I specifically included an env var for the OpenAI model to use, so that we aren't forced to keep up with the rapidly evolving AI space, sort of like pinning a package version.
This opens up some exciting possibilities in the future, such as importing strange recipe sources (unstructured data, OCR, etc.).
Testing
(fill-in or delete this section)
You need an OpenAI API key to properly test this, but I added a mocked test to confirm the flow works as long as we get data back from OpenAI.
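A mocked test along these lines can exercise the flow without an API key. The `parse_ingredients` helper and the response shape here are hypothetical stand-ins for the real service, purely to show the mocking approach:

```python
import json
from unittest.mock import Mock

# Hypothetical stand-in for the real parser: it takes a callable that plays
# the role of the OpenAI request and decodes the JSON reply.
def parse_ingredients(raw: list, openai_call) -> list:
    reply = openai_call(json.dumps(raw))
    return json.loads(reply)["ingredients"]

# Mock the "OpenAI" callable so no network access or API key is needed.
mocked = Mock(return_value=json.dumps(
    {"ingredients": [{"input": "2 cups flour", "quantity": 2, "unit": "cup", "food": "flour"}]}
))
result = parse_ingredients(["2 cups flour"], mocked)
assert result[0]["food"] == "flour"
```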